Abstract

We prove upper and lower bounds and give an approximation algorithm
for the cover time of the random walk on a graph. We introduce a
parameter M motivated by the well-known Matthews bounds on the cover
time and prove that $M/2 \le C = O(M (\ln\ln n)^2)$. We give a
deterministic polynomial-time algorithm to approximate M within a
factor of 2; this then approximates C within a factor of
$O((\ln\ln n)^2)$, improving the previous bound of $O(\ln n)$ due to
Matthews.

The blanket time B was introduced by Winkler and Zuckerman: it is the
expectation of the first time when all vertices are visited within a
constant factor of the number of times suggested by the stationary
distribution. Obviously $C \le B$, and they conjectured $B = O(C)$ and
proved $B = O(C \ln n)$. Our bounds above are also valid for the
blanket time, and so it follows that $B = O(C (\ln\ln n)^2)$.

1 Introduction 

Given a connected graph G on n vertices, for a vertex $i \in V(G)$,
$C(i)$ denotes the cover time of the usual random walk on G, starting
from i; that is, $C(i)$ is the expectation of the number of steps a
random walk starting from i takes until it covers all vertices of G.
The quantity $C = \max_{i \in V(G)} C(i)$ is called the cover time of
G. (See for background.)

Although C is a basic notion in the theory of random walks, there is
no effective way known to compute this parameter, given the adjacency
matrix of G as the input. The following question has been open for
several years.

Question. Is there a deterministic algorithm 
which approximates C up to a constant factor in 
polynomial time ? 

The requirement that the algorithm is deterministic is crucial, and
this is what makes the problem difficult. It is simple to provide a
randomized algorithm which approximates C within a factor
$(1 + \varepsilon)$ for any positive constant $\varepsilon$, with high
probability: just simulate the chain and take the average of the
empirical cover times.
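The randomized approach just described can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the graph is given as adjacency lists, and the function names, the number of trials and the example graph $K_5$ are all hypothetical choices.

```python
import random

def simulate_cover_time(adj, start, rng):
    """Run one random walk from `start` until every vertex has been visited."""
    unseen = set(adj) - {start}
    current, steps = start, 0
    while unseen:
        current = rng.choice(adj[current])  # move to a uniformly random neighbor
        unseen.discard(current)
        steps += 1
    return steps

def estimate_cover_time(adj, trials=2000, seed=0):
    """Estimate C = max_i C(i) by averaging empirical cover times per start."""
    rng = random.Random(seed)
    return max(
        sum(simulate_cover_time(adj, v, rng) for _ in range(trials)) / trials
        for v in adj
    )

# Hypothetical example: the complete graph K_5 as adjacency lists.
k5 = {v: [u for u in range(5) if u != v] for v in range(5)}
estimate = estimate_cover_time(k5)
```

For $K_5$ the exact cover time is $4(1 + 1/2 + 1/3 + 1/4) = 25/3$ by the coupon collector argument, so the estimate should land close to 8.33. The sketch makes no attempt to control the error probability rigorously.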

Prior to this paper, the best approximation factor we knew of was
$\ln n$. This factor can be achieved using the following fundamental
result of Matthews. For any pair of vertices $i, j \in V(G)$,
$H(i,j)$ denotes the hitting time from i to j. We set

$$h_{\max} = \max_{i,j \in V} H(i,j), \qquad h_{\min} = \min_{i,j \in V} H(i,j),$$

and more generally, for every set $S \subseteq V$, we let

$$h_S = \min_{i,j \in S} H(i,j).$$

Here and below, minima over pairs are taken over pairs with $i \ne j$.

Let $\mathrm{har}(n) = \sum_{k=1}^{n} 1/k$.

Theorem 1.1 (Matthews' theorem) For any G,

$$h_{\min}\,\mathrm{har}(n) \le C \le h_{\max}\,\mathrm{har}(n).$$

More generally, for any subset $S \subseteq V(G)$ with $|S| \ge 2$,

$$h_S\,\mathrm{har}(|S|) \le C. \qquad (1)$$
It follows from the upper bound in Matthews' theorem and the
definition of the cover time that

$$h_{\max} \le C \le h_{\max}\,\mathrm{har}(n).$$

Thus, $h_{\max}$ approximates C within a factor of
$\mathrm{har}(n) \sim \ln n$. Moreover, since the $H(i,j)$'s are
easily computable in polynomial time, $h_{\max}$ is computable in
polynomial time. Unfortunately, $h_{\max}$ can be equal to the cover
time (as shown by a path), as well as a factor of $\ln n$ off the
cover time (as shown by a complete graph).
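To make the polynomial-time computability of the hitting times concrete, here is a minimal sketch that solves the standard linear system $H(i,t) = 1 + \frac{1}{\deg i}\sum_{u \sim i} H(u,t)$, $H(t,t) = 0$, exactly with rational arithmetic. The adjacency-list representation, the function names and the two small test graphs are assumptions made for illustration only.

```python
from fractions import Fraction

def hitting_times_to(adj, target):
    """Exact hitting times H(v, target) for every vertex v of a connected graph."""
    nodes = [v for v in adj if v != target]
    idx = {v: r for r, v in enumerate(nodes)}
    n = len(nodes)
    # Augmented matrix of (I - P restricted to non-target vertices) h = 1.
    rows = [[Fraction(0)] * n + [Fraction(1)] for _ in range(n)]
    for v in nodes:
        rows[idx[v]][idx[v]] += 1
        for u in adj[v]:
            if u != target:
                rows[idx[v]][idx[u]] -= Fraction(1, len(adj[v]))
    # Gauss-Jordan elimination over the rationals.
    for c in range(n):
        pivot = next(r for r in range(c, n) if rows[r][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        inv = 1 / rows[c][c]
        rows[c] = [x * inv for x in rows[c]]
        for r in range(n):
            if r != c and rows[r][c] != 0:
                f = rows[r][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[c])]
    h = {v: rows[idx[v]][n] for v in nodes}
    h[target] = Fraction(0)
    return h

def h_max(adj):
    """Largest hitting time over all ordered pairs of vertices."""
    return max(max(hitting_times_to(adj, t).values()) for t in adj)

def har(n):
    """Harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Hypothetical examples: a path and a complete graph on 4 vertices.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
upper_bound_k4 = h_max(k4) * har(4)  # Matthews upper bound h_max * har(n)
```

On the path, `h_max(path4)` is the end-to-end hitting time 9. On $K_4$ it is 3, so the Matthews upper bound evaluates to $3 \cdot 25/12$. Repeated entries in an adjacency list can be used to model multiple edges or loops.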

We could try to use the lower bound (1) in Matthews' theorem, by
maximizing over S. Unfortunately, this can be even worse than
$h_{\max}$. For example, if G consists of a single edge with N loops
added at one of the nodes, then the best Matthews lower bound is 1,
while the cover time, starting from either the node with the loops or
from the stationary distribution, is about N. This problem is easy to
fix: just throw in the obvious lower bound $h_{\max}$. More precisely,
let $M_0$ be the maximum of $h_{\max}$ and the quantities
$h_S \ln|S|$ ($S \subseteq V$, $|S| \ge 2$). Then $M_0$ is a lower
bound on C, and (as we'll see) it is only a $(\ln\ln n)^2$ factor off.
We call $M_0$ the augmented Matthews bound.

A parameter closely related to the cover time is the blanket time,
introduced by Winkler and Zuckerman. The definition provided below is
a little bit stronger.

Definition. Consider a random walk starting from a node v. Let
$\nu_{T,v}(x)$ be the number of visits to x up to time T. Let B be the
first time T when the ratio
$\frac{\nu_{T,v}(i)/\pi_i}{\nu_{T,v}(j)/\pi_j}$ is at most 2 for any
two nodes i and j (in particular, all nodes are covered by this time).
Let $B(v)$ be the expectation of B. The blanket time B is the maximum
of $B(v)$ over all vertices v.

It is clear that $C \le B$. Winkler and Zuckerman conjectured that
there is a constant K so that $B \le KC$, and showed that

$$B = O(C \ln n).$$

The main goal of this paper is to improve the factor $O(\ln n)$ in
both problems mentioned above to $O((\ln\ln n)^2)$.

The following variant of the augmented Matthews bound $M_0$ is at the
heart of our study.

Let $\kappa(i,j) = H(i,j) + H(j,i)$ be the commute time between i and
j. For any $S \subseteq V(G)$, let
$\kappa_S = \min_{i,j \in S} \kappa(i,j)$, and

$$M = \max_{S \subseteq V(G)} \kappa_S \ln|S|.$$

As the following proposition shows, M and $M_0$ are essentially
equivalent, but due to the symmetry of $\kappa$, M will be easier to
handle.

Proposition 1.2

$$\tfrac{1}{8} M \le M_0 \le M.$$

Our main theorem is the following. 
Theorem 1.3 For every graph G on n vertices 
^M <C <B < 10 5 M(lnlnn) 2 

(Of course the lower bound $C \ge M/8$ follows from Proposition 1.2
and Matthews' bound.)

It follows from this theorem that M approximates C within a factor
$O((\ln\ln n)^2)$ and that $B \le K C (\ln\ln n)^2$ for some constant
K. It turns out, somewhat surprisingly, that both the upper bound and
the lower bound are sharp up to a constant factor.

The proof of the lower bound will give a somewhat stronger result. Let
$C(\pi) = \sum_i \pi_i C(i)$ denote the cover time when the walk is
started from a random node from the stationary distribution $\pi$.

Theorem 1.4 For any graph G, 

C(tt) > \m. 

An important property of M as an approximation of the cover time is
that it is efficiently approximable:

Theorem 1.5 M can be approximated within a factor of 2 by a
deterministic polynomial-time algorithm.
The rest of the paper is divided into five sections. In Section 2, we
describe an algorithm which computes M up to a factor of 2, proving
Theorem 1.5. In Section 3, as preparation for the proof of the lower
bound in Theorem 1.3, we derive some formulas for the cover time,
which may be interesting in their own right. In Section 4, we complete
the proof of Theorem 1.4, and also prove Proposition 1.2. The proof of
the upper bound in Theorem 1.3, which is the most substantial part of
this paper, follows in Section 5. In the final Section 6, we give
constructions which show that both the upper and lower bounds in
Theorem 1.3 can be attained.



2 Approximating M 

Since the commute times $\kappa(i,j)$ are polynomially computable, the
quantity $\kappa_S$ is also polynomially computable for any set
$S \subseteq V(G)$. However, the definition of M involves all
(exponentially many) subsets of $V(G)$ and it is not clear that one
can compute M in polynomial time. In the following, we show that one
can, at least, approximate M to within a factor of 2 in polynomial
time.

A preliminary remark: the commute time satisfies the triangle
inequality:

$$\kappa(i,k) \le \kappa(i,j) + \kappa(j,k),$$

and hence we can consider it as a "distance" on the graph.

Algorithm. To start, pick an arbitrary vertex $v_1$. At the i-th step
($i = 1, 2, \ldots, n$), we have selected the set
$V_i = \{v_1, \ldots, v_i\}$. Choose $v_{i+1}$ to be a vertex
$v \in V \setminus V_i$ whose distance $\min_{u \in V_i} \kappa(u,v)$
from $V_i$ is maximum. Compute $M_i = \kappa_{V_i} \ln i$ for all
$i = 2, 3, \ldots, n$, and output $M' = \max_i M_i$.

Since the $\kappa(i,j)$ are polynomially computable, our algorithm
runs in polynomial time. Moreover, $M' \le M$ by definition. It
remains to show that $2M' \ge M$.
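The greedy procedure can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes the commute times are supplied as a nested dictionary `kappa[u][v]`, and the brute-force `exact_M` is a hypothetical helper included only so the factor-2 guarantee can be checked on a tiny input.

```python
import math
from itertools import combinations

def approximate_M(kappa, vertices):
    """Greedy farthest-point ordering; returns M' = max_i kappa_{V_i} * ln i."""
    start = vertices[0]                      # arbitrary first vertex v_1
    # dist[v] = distance from v to the currently selected set V_i
    dist = {v: kappa[start][v] for v in vertices[1:]}
    selected = 1
    kappa_V = math.inf                       # min pairwise distance inside V_i
    best = 0.0
    while dist:
        v = max(dist, key=dist.get)          # vertex farthest from V_i
        kappa_V = min(kappa_V, dist.pop(v))
        selected += 1
        best = max(best, kappa_V * math.log(selected))
        for u in dist:                       # update distances to V_{i+1}
            dist[u] = min(dist[u], kappa[v][u])
    return best

def exact_M(kappa, vertices):
    """Brute force over all subsets; exponential, for tiny inputs only."""
    best = 0.0
    for size in range(2, len(vertices) + 1):
        for S in combinations(vertices, size):
            k_S = min(kappa[a][b] for a, b in combinations(S, 2))
            best = max(best, k_S * math.log(size))
    return best

# Hypothetical example: the path on 4 vertices has 3 edges, so the
# commute time between i and j is 2 * 3 * |i - j|.
vertices = [0, 1, 2, 3]
kappa = {i: {j: 6 * abs(i - j) for j in vertices} for i in vertices}
m_approx = approximate_M(kappa, vertices)
m_exact = exact_M(kappa, vertices)
```

On this input both values equal $18 \ln 2$, attained by the two endpoints. In general the sketch only promises $M' \le M \le 2M'$.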

Assume that M is attained at a set $S \subseteq V(G)$ of cardinality
s. We claim that $2M_s \ge M$. It suffices to show that
$\kappa_{V_s} \ge \kappa_S / 2$.

Let $R = S \setminus V_s$. If R is empty then $S = V_s$ and we are
done, so we assume that $|R| = r > 0$. By the description of the
algorithm, $\kappa_{V_s} = \kappa(v_s, v_j)$ for some $j < s$.

For each vertex $x \in R$, there is a vertex $y_x \in V_{s-1}$ so that
$\kappa(x, y_x) \le \kappa_{V_s}$. If $y_x \in S$ for some x, then
$\kappa_S \le \kappa(x, y_x) \le \kappa_{V_s}$, and we are done. If
$y_x \notin S$ for all $x \in R$, then (using that
$|V_{s-1} \setminus S| = (s-1) - (s-r) = r - 1 < r$) the pigeonhole
principle gives that there are x and x' in R so that
$y_x = y_{x'} = y$. So by the triangle inequality

$$\kappa(x,x') \le \kappa(x,y) + \kappa(y,x') \le 2\kappa_{V_s}.$$

By definition $\kappa(x,x') \ge \kappa_S$ and the proof is complete.

Remark. The only property of the commute times we use here is the
triangle inequality. Therefore, our result holds in a more general
setting. Consider a metric w on a finite set V of n points. For any
subset $S \subseteq V$, let $w_S = \min_{i,j \in S} w(i,j)$ (if S has
fewer than 2 elements, $w_S = 0$). Define

$$W = \max_{S} w_S f(|S|),$$

where f is any non-negative function defined on the set of
non-negative integers.

Corollary 2.1 For any finite metric space and any non-negative f, the
above algorithm (with $\kappa(i,j)$ replaced by $w(i,j)$) computes W
within a factor of 2.

3 Formulas for the cover time 

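Fix a set $S \subseteq V$, $|S| = s \ge 2$, and a starting node v. For
a given random walk $(v = v^0, v^1, v^2, \ldots)$, and a set
$T \subseteq S$, let $Z(T)$ denote the set of nodes of S not seen
before T is first reached. Thus $T \subseteq Z(T)$. Define, for
$i, j \in S$,

$$A(i,j) = \begin{cases} \dfrac{1}{|Z(i)|\,(|Z(i)| - 1)}, & \text{if } j \in Z(i) \setminus \{i\}, \\ 0, & \text{otherwise,} \end{cases}$$

(this number depends on the walk) and let $a(i,j) = E[A(i,j)]$. We
have

$$\sum_{i \in S} \sum_{j \in S} A(i,j) = \sum_{i \in S} \sum_{j \in Z(i) \setminus \{i\}} \frac{1}{|Z(i)|\,(|Z(i)| - 1)} = \sum_{i :\, |Z(i)| > 1} \frac{1}{|Z(i)|} = \frac12 + \frac13 + \cdots + \frac1s,$$

and thus

$$\sum_{i \in S} \sum_{j \in S} a(i,j) = \frac12 + \frac13 + \cdots + \frac1s. \qquad (2)$$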
Using this notation, we can state a formula for the expected number
$C(v,S)$ of steps until all nodes of S are visited. The basic idea
here is similar to that in Matthews' theorem.

Lemma 3.1

$$C(v,S) = \frac1s \sum_{j \in S} H(v,j) + \sum_{i \in S} \sum_{j \in S} a(i,j) H(i,j).$$

Proof. Let $X_k$ be the number of steps required to see k nodes of S.
Clearly $C(v,S) = E[X_s]$. Let $T(i)$ be the number of steps required
to see node i. The following algebraic identity is easy to verify:

$$X_s = \frac1s \sum_{k=1}^{s} X_k + \sum_{1 \le k < m \le s} \frac{1}{(s-k)(s-k+1)} (X_m - X_k). \qquad (3)$$

Now here

$$E\Big[\sum_{k=1}^{s} X_k\Big] = \sum_{j \in S} E[T(j)] = \sum_{j \in S} H(v,j).$$

For the second sum in (3), we fix the first $X_k$ steps; then

$$\sum_{m :\, k < m \le s} (X_m - X_k) = \sum_{j \in Z(v^{X_k})} \big(T(j) - T(v^{X_k})\big),$$

and hence

$$E\Big[\sum_{m :\, k < m \le s} (X_m - X_k)\Big] = \sum_{j \in Z(v^{X_k})} H(v^{X_k}, j).$$

Summing over k, we get

$$\sum_{k=1}^{s-1} \frac{1}{(s-k)(s-k+1)} \sum_{j \in Z(v^{X_k})} H(v^{X_k}, j) = \sum_{i \in S} \sum_{j \in Z(i) \setminus \{i\}} \frac{H(i,j)}{(|Z(i)| - 1)\,|Z(i)|}.$$

Taking expectation again, we get the lemma. □
For $i, j \in S$, let

$$Q(i,j) = \frac{1}{|Z(ij)|\,(|Z(ij)| - 1)},$$

where $Z(ij) = Z(\{i,j\})$, and $q(i,j) = a(i,j) + a(j,i) = E[Q(i,j)]$.
Using the identity

$$H(\pi,j) - H(\pi,i) = H(i,j) - H(j,i) \qquad (4)$$

due to Tetali and Winkler, which implies that

$$H(i,j) = \tfrac12 \kappa(i,j) + \tfrac12 \big(H(\pi,j) - H(\pi,i)\big), \qquad (5)$$

a simple computation gives the following lemma:
Lemma 3.2 

C(v,S) = -Y(H(v,j)-H(7T,j)) (6) 



+ 7££«(*\i)«(*,j) 



(7) 
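The identities (4) and (5) used above are easy to check numerically on a small graph. The sketch below does so with exact rational arithmetic. The solver, the variable names and the test graph (a triangle with a pendant path) are illustrative assumptions, and $H(\pi,j)$ is computed as $\sum_k \pi_k H(k,j)$ with $\pi_k$ proportional to the degree of k.

```python
from fractions import Fraction

def hitting_times(adj):
    """Return exact hitting times H[i][j] for a connected graph."""
    nodes = list(adj)
    H = {i: {} for i in nodes}
    for t in nodes:
        others = [v for v in nodes if v != t]
        pos = {v: r for r, v in enumerate(others)}
        n = len(others)
        # Linear system h(v) = 1 + average of h over the neighbors, h(t) = 0.
        rows = [[Fraction(0)] * n + [Fraction(1)] for _ in range(n)]
        for v in others:
            rows[pos[v]][pos[v]] += 1
            for u in adj[v]:
                if u != t:
                    rows[pos[v]][pos[u]] -= Fraction(1, len(adj[v]))
        for c in range(n):                   # Gauss-Jordan elimination
            p = next(r for r in range(c, n) if rows[r][c] != 0)
            rows[c], rows[p] = rows[p], rows[c]
            inv = 1 / rows[c][c]
            rows[c] = [x * inv for x in rows[c]]
            for r in range(n):
                if r != c:
                    f = rows[r][c]
                    rows[r] = [x - f * y for x, y in zip(rows[r], rows[c])]
        for v in others:
            H[v][t] = rows[pos[v]][n]
        H[t][t] = Fraction(0)
    return H

# Hypothetical test graph: a triangle 0-1-2 with a pendant path 2-3-4.
lollipop = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
H = hitting_times(lollipop)
two_E = sum(len(nb) for nb in lollipop.values())
pi = {v: Fraction(len(nb), two_E) for v, nb in lollipop.items()}
H_pi = {j: sum(pi[k] * H[k][j] for k in lollipop) for j in lollipop}

# Identity (4): H(pi, j) - H(pi, i) = H(i, j) - H(j, i).
identity_4_holds = all(
    H_pi[j] - H_pi[i] == H[i][j] - H[j][i]
    for i in lollipop for j in lollipop
)
# Identity (5): H(i, j) = kappa(i, j)/2 + (H(pi, j) - H(pi, i))/2.
identity_5_holds = all(
    H[i][j] == (H[i][j] + H[j][i]) / 2 + (H_pi[j] - H_pi[i]) / 2
    for i in lollipop for j in lollipop
)
```

Because the arithmetic is exact, both checks are equalities rather than approximate comparisons.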
Let $q_\pi(i,j)$ be the expectation of $q(i,j)$ when the starting
node v is chosen at random from the stationary distribution.
Averaging over v, the first term in (6) cancels, and we get

Corollary 3.3 

C(tt,S) = ^ ££«(», 

4 Proof of the lower bound

Proof of Theorem 1.4. This follows easily from Corollary 3.3 and (2):



C(tt) > C(tt,S) = 

> 7«s££?»r(*J) = -n s ln\S\. 



Proof of Proposition 1.2. It is obvious that for every
$S \subseteq V$

$$h_S \ln|S| \le \tfrac12 \min_{i,j \in S} \kappa(i,j) \ln|S| \le \tfrac12 M,$$

and for every $i, j \in V$,

$$H(i,j) \le \kappa(i,j) \le M.$$

Hence $M_0 \le M$.

To prove the other bound, let S be the set attaining the maximum in
the definition of M. If $h_{\max} \ge M/4$, then we have nothing to
prove, so suppose that $H(i,j) < M/4$ for all i and j.

We define a digraph D on S as follows. There is an edge from i to j if
and only if $H(i,j) < \kappa(i,j)/4$. In this case, it is clear that

$$H(j,i) - H(i,j) \ge \kappa(i,j)/2 \ge \kappa_S/2.$$

Let $i_0 i_1 \ldots i_m$ be a (directed) path of length m in D. Then by
the cycle law we have

$$H(i_m, i_0) \ge \sum_{t=0}^{m-1} \big(H(i_{t+1}, i_t) - H(i_t, i_{t+1})\big) \ge m \kappa_S / 2.$$

On the other hand, $H(i,j) < \kappa_S \ln|S| / 4$ for all
$i, j \in S$. This implies that $m < \ln|S|/2$. A theorem of Gallai
implies that D is $\ln|S|/2$ colorable, and therefore D contains an
independent set I of size at least $2|S| / \ln|S| \ge |S|^{1/2}$. By
the definition of D,

$$H(i,j) \ge \kappa(i,j)/4 \ge \kappa_S/4$$

for any $i, j \in I$. Therefore

$$\min_{i,j \in I} H(i,j) \ln|I| \ge \tfrac14 \kappa_S \cdot \tfrac12 \ln|S| = \tfrac18 M.$$

□

5 Proof of the upper bound

We need a Chernoff type large deviation inequality, which will be
shown using fairly standard arguments.

Lemma 5.1 Let $X_1, \ldots, X_k$ be independent non-negative integer
valued random variables with

$$P[X_i = m] \le a (1-p)^m \qquad \forall m \ge 1$$

for some numbers $a > 0$ and $0 < p < 1$. Let $X = \sum_{i=1}^{k} X_i$.
Then for any $L > 0$,

$$P[X - E[X] \le -L] \le \exp\Big(-\frac{p^3 L^2}{4(1-p)ak}\Big).$$

Proof. As usual, we first estimate $E[e^{-\lambda X_i}]$ for
$\lambda > 0$. Taylor expansion gives

$$E[e^{-\lambda X_i}] = 1 - \lambda E[X_i] + (\lambda^2/2)\,E[X_i^2 e^{-\lambda^* X_i}] \le \exp\big(-\lambda E[X_i] + (\lambda^2/2)\,E[X_i^2]\big)$$

for some $\lambda^*$ between 0 and $\lambda$. Since

$$E[X_i^2] = \sum_{m=1}^{\infty} m^2 P[X_i = m] \le a \sum_{m=1}^{\infty} m^2 (1-p)^m$$

and

$$\sum_{m=1}^{\infty} m^2 (1-p)^m = \frac{(1-p) + (1-p)^2}{p^3} \le \frac{2(1-p)}{p^3},$$

it follows that

$$E[e^{-\lambda X_i}] \le \exp\Big(-\lambda E[X_i] + \frac{a(1-p)\lambda^2}{p^3}\Big).$$

Therefore

$$E[e^{-\lambda(X - E[X])}] = \prod_{i=1}^{k} E[e^{-\lambda(X_i - E[X_i])}] \le \exp\Big(\frac{ak(1-p)\lambda^2}{p^3}\Big).$$

Taking $\lambda = p^3 L / (2(1-p)ak)$, we have

$$P[X - E[X] \le -L] \le E[e^{-\lambda(X - E[X] + L)}] \le \exp\Big(-\lambda L + \frac{ak(1-p)\lambda^2}{p^3}\Big) = \exp\Big(-\frac{p^3 L^2}{4(1-p)ak}\Big).$$

□

Lemma 5.2 Let i and j be two nodes and $k \ge 1$. Let $W_k$ be the
number of times j had been visited when i was visited the k-th time.
Then for every $\varepsilon > 0$,

$$P\Big[W_k \le (1-\varepsilon)\frac{\pi_j}{\pi_i} k\Big] \le \exp\Big(-\frac{\varepsilon^2 k}{4 \pi_i \kappa(i,j)}\Big).$$



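Proof. Let us restrict the Markov chain to i and j only. It is well
known that we get a time-reversible Markov chain with transition
probabilities

$$p_{ij} = \frac{1}{\pi_i \kappa(i,j)}, \quad p_{ii} = 1 - p_{ij}, \qquad p_{ji} = \frac{1}{\pi_j \kappa(i,j)}, \quad p_{jj} = 1 - p_{ji},$$

and stationary probabilities

$$\pi'_i = \frac{\pi_i}{\pi_i + \pi_j}, \qquad \pi'_j = \frac{\pi_j}{\pi_i + \pi_j}.$$

We may consider this very simple Markov chain to prove the lemma.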

Define $X_k$ to be the number of visits to j during the k-th return
trip from i to itself, that is,



w, 



k+1 



It is clear that the $X_k$ are i.i.d. with

$$P[X_1 = 0] = p_{ii}, \qquad P[X_1 = m] = p_{ij}\, p_{jj}^{m-1}\, p_{ji} \quad (m \ge 1).$$

Clearly $W_k = X_1 + \cdots + X_k$. Thus

$$E[W_k] = k E[X_1] = k \frac{\pi_j}{\pi_i}.$$



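Applying Lemma 5.1 with $a = p_{ij} p_{ji} (1 - p_{ji})^{-1}$,
$p = p_{ji}$ and $L = \varepsilon \frac{\pi_j}{\pi_i} k$, we obtain

$$P\Big[W_k \le (1-\varepsilon)\frac{\pi_j}{\pi_i} k\Big] = P\Big[X - E[X] \le -\varepsilon \frac{\pi_j}{\pi_i} k\Big] \le \exp\Big(-\frac{p_{ji}^3\, \varepsilon^2 \pi_j^2 k^2}{4 \pi_i^2\, p_{ij} p_{ji}\, k}\Big) = \exp\Big(-\frac{\varepsilon^2 k}{4 \pi_i \kappa(i,j)}\Big).$$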



□ 



Proof of the upper bound in Theorem 1.3. Consider the ordering
$(v_0, v_1, \ldots, v_{n-1})$ of the nodes of G as obtained by the
Algorithm in Section 2. For convenience, relabel the nodes by
$(1, \ldots, n)$. Recall that each $i > 1$ is a node farthest away
from the set $\{1, \ldots, i-1\}$ in distance $\kappa$.

For each node $i > 1$, let i' be a node with $i' \le \sqrt{i}$ and
$\kappa(i,i')$ minimal. Clearly, the edges ii' form a tree T. We
consider 1 as the root of the tree. It is also clear that the depth d
of T is at most $1.5 \ln\ln n$.



Our next observation is that

$$\kappa(i,i') \le \frac{2M}{\ln i}. \qquad (8)$$

Indeed, let $S = \{1, \ldots, \lfloor \sqrt{i} \rfloor + 1\}$. Then, by
the definition of M,

$$\kappa_S \le \frac{M}{\ln|S|} \le \frac{2M}{\ln i},$$

and hence there exist nodes $u, v \in S$ with
$\kappa(u,v) \le 2M/\ln i$. We may assume that $u < v$. By the choice
of the ordering, there exists a node $j \le v - 1 \le \sqrt{i}$ such
that $\kappa(i,j) \le \kappa(u,v)$. It follows that

$$\kappa(i,i') \le \kappa(u,v) \le \frac{2M}{\ln i}.$$

Set $\varepsilon = 1/(8 \ln\ln n)$ and
$T_0 = \lceil 400M/\varepsilon^2 \rceil$. Our next goal is to bound
the probability that $B > T$ for some $T \ge T_0$.

Set $F(i) = \nu_T(i) / (T \pi_i)$, where $\nu_T(i)$ is the number of
visits to i up to time T. On the average, $F(i) = 1$. If the event
"$B > T$" occurs, then there exists an edge ii' of T with one of the
following properties:

(A) $F(i') \ge 0.9(1 + \ln i')^{-\varepsilon}$ and
$F(i) < 0.9(1 + \ln i)^{-\varepsilon}$;

(B) $F(i') \le 1.1(1 + \ln i')^{\varepsilon}$ and
$F(i) > 1.1(1 + \ln i)^{\varepsilon}$;

(C) $F(i') < 0.9(1 + \ln i')^{-\varepsilon}$ and
$F(i) \ge 0.9(1 + \ln i)^{-\varepsilon}$;

(D) $F(i') > 1.1(1 + \ln i')^{\varepsilon}$ and
$F(i) \le 1.1(1 + \ln i)^{\varepsilon}$.

Indeed, if B is larger than T, then there exists a node u such that
either $F(u) > \sqrt{2}$ or $F(u) < 1/\sqrt{2}$. Suppose that, e.g.,
the second occurs. We assume that $n > 10$, to exclude some trivial
complications. Then $F(u) < 0.9(1 + \ln u)^{-\varepsilon}$. We also
know that there is a node v with
$F(v) \ge 1 > 0.9(1 + \ln v)^{-\varepsilon}$. If $F(1) \ge 0.9$, then
along the path from u to 1 there is an edge with property (A). If
$F(1) < 0.9$, then along the path from v to 1 there is an edge with
property (C).

We call such an edge "bad". To bound the probability that an edge is
bad, we have to bound the probabilities of (A), (B), (C) and (D)
separately. This is very similar in all cases, and we give the details
for (A). Let
$k = \lceil 0.9(1 + \ln i')^{-\varepsilon} \pi_{i'} T \rceil$, and
consider the step when i' is reached the k-th time. By (A), the number
of times we have seen i is

$$W_k < 0.9(1 + \ln i)^{-\varepsilon} \pi_i T \le \Big(\frac{1 + \ln i}{1 + \ln i'}\Big)^{-\varepsilon} \frac{\pi_i}{\pi_{i'}}\, k \le 2^{-\varepsilon} \frac{\pi_i}{\pi_{i'}}\, k \le \Big(1 - \frac{\varepsilon}{4}\Big) \frac{\pi_i}{\pi_{i'}}\, k,$$

and hence by Lemma 5.2,

$$P[A] \le \exp\Big(-\frac{\varepsilon^2 k}{100\, \pi_{i'} \kappa(i,i')}\Big).$$

Now here

$$\frac{k}{\pi_{i'}} \ge 0.9(1 + \ln i')^{-\varepsilon}\, T \ge \frac12 T,$$

and hence by (8),

$$P[A] \le \exp\Big(-\frac{T \varepsilon^2 \ln i}{200 M}\Big) = i^{-T\varepsilon^2/(200M)}.$$

The probability that this happens for some edge ii' is at most

$$\sum_{i=2}^{n} i^{-T\varepsilon^2/(200M)} < 2 \cdot 2^{-T\varepsilon^2/(200M)}$$

(using here that $T \ge T_0$) and hence the probability that $B > T$ is
at most $8 \cdot 2^{-T\varepsilon^2/(200M)}$. Thus

$$E[B] = \sum_{T=0}^{\infty} P[B > T] \le T_0 + \sum_{T=T_0}^{\infty} P[B > T] \le T_0 + 8 \sum_{T=T_0}^{\infty} 2^{-T\varepsilon^2/(200M)} < 2T_0,$$

which proves the theorem. □


Remark. We may prove the upper bound using a slightly different
approach. Let $S_0$ be the set of all vertices and inductively define
$S_i$ to be a maximal subset of $S_{i-1}$ such that
$\kappa_{S_i} \ge \kappa_i := 2^i M / \log n$. If such a subset does
not exist, $S_i$ consists of a vertex in $S_{i-1}$ and the
construction stops. Since $\kappa_{S_i} \le M$ unless $|S_i| = 1$,
this procedure stops within $O(\ln\ln n)$ steps. The advantage of this
approach is that we may have a better upper bound if the procedure
stops earlier. For example, if G is a complete graph, then the
construction stops after 1 step. More generally, for each
$x \in S_i \setminus S_{i+1}$, take a vertex $y \in S_{i+1}$ with
$\kappa(x,y) < \kappa_{i+1}$. This is possible since $S_{i+1}$ is a
maximal subset. Regarding the pair xy as an edge, this gives a tree
with depth at most $O(\ln\ln n)$. Let l be the minimum possible depth.
Then the same proof would yield $B = O(l^2 M)$.

6 The sharpness of the Main Theorem

In this section we show that both the lower bound and upper bound in
the Main Theorem 1.3 are sharp, up to a constant factor. More exactly,
we give an example where B and C are of order $\Theta(M)$ and also one
where B and C are of order $\Theta(M (\ln\ln n)^2)$.

The proof for the lower bound is easy: for the complete graph on n
vertices, all three parameters B, C and M are $\Theta(n \ln n)$.

The construction to match the upper bound is more complicated. It is a
tree of depth d defined as follows. The root is at level 1. Each
vertex at the i-th level has $2^{2^i}$ children, and the edge between
the mother and a child has multiplicity $2^i$.

The number of vertices in the i-th level is

$$N_i = \prod_{j=1}^{i-1} 2^{2^j} = 2^{2^i - 2}.$$

The number of the vertices in the whole tree is

$$n = \sum_{i=1}^{d} N_i = \sum_{i=1}^{d} 2^{2^i - 2} = N_d (1 + o(1)).$$

The number of edges between the i-th and (i+1)-th level is
$E_i = 2^i N_{i+1} = 2^{2^{i+1} + i - 2}$. The total number of edges is

$$E = \sum_{i=1}^{d-1} E_i.$$

Notice that $d = O(\ln\ln n)$. We first show

$$M = \Theta(E).$$

It is well known that the commute time between two vertices x, y in a
tree (possibly with multiple edges) is

$$\kappa(x,y) = 2E \sum_{j=0}^{l-1} \frac{1}{m(x_j x_{j+1})}, \qquad (9)$$

where $x = x_0, \ldots, x_l = y$ is the path connecting x and y, and
$m(vw)$ is the multiplicity of the edge vw. For a lower bound,
consider the set $S_2$ of (four) vertices in level 2. The commute time
is 2E for any pair (by (9)), which gives $M \ge 2E \ln 4$. For an
upper bound, let S be a set of size at least 2. Then take the maximum
level $i_0$ such that there is a vertex in level $i_0$ having at least
two descendants (including itself) in S. Since the multiplicity of an
edge geometrically increases as the level increases, (9) implies that
the commute time of a pair which has a common ancestor in level $i_0$
is at most $O(E/2^{i_0})$; in particular $\kappa_S = O(E/2^{i_0})$.
Moreover, since no pair has a common ancestor in level $i_0 + 1$, the
number of vertices in S below level $i_0$ is at most $N_{i_0+1}$.
Trivially, the number of vertices of S above or in level $i_0$ is at
most $\sum_{i=1}^{i_0} N_i = o(N_{i_0+1})$. Thus
$|S| \le (1 + o(1)) N_{i_0+1}$ and

$$\kappa_S \ln|S| = O(E).$$
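The counting in this construction is easy to get wrong by one level, so a small numerical check may help. The sketch below tabulates $N_i$ and $E_i$ for a small depth and evaluates formula (9) for two siblings on level 2. The function names and the choice $d = 3$ are hypothetical, used only for illustration.

```python
from fractions import Fraction

def level_sizes(d):
    """N_1..N_d: the root is level 1, a level-i vertex has 2**(2**i) children."""
    sizes = [1]
    for i in range(1, d):
        sizes.append(sizes[-1] * 2 ** (2 ** i))
    return sizes

def edge_counts(d):
    """E_1..E_{d-1}: edges between levels i and i+1, each of multiplicity 2**i."""
    sizes = level_sizes(d)
    # sizes[i] holds N_{i+1}, so E_i = 2**i * N_{i+1}.
    return [2 ** i * sizes[i] for i in range(1, d)]

def commute_time(total_edges, multiplicities):
    """Formula (9): 2E times the sum of 1/m over the edges of the tree path."""
    return 2 * total_edges * sum(Fraction(1, m) for m in multiplicities)

d = 3
sizes = level_sizes(d)
edges = edge_counts(d)
E = sum(edges)
# Two level-2 siblings are joined through the root by two edges of multiplicity 2.
sibling_commute = commute_time(E, [2, 2])
```

For $d = 3$ this gives $N = (1, 4, 64)$ and $E_1 = 8$, $E_2 = 256$, matching the closed forms $2^{2^i - 2}$ and $2^{2^{i+1} + i - 2}$, and the sibling commute time equals $2E$ as used in the lower bound $M \ge 2E \ln 4$.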

In the rest of this section, we shall omit floors and ceilings, for
the sake of a clearer presentation.

Claim 6.1 The cover time of this tree satisfies
$C = \Omega(M d^2)$.

Proof. It suffices to show that for a sufficiently large constant K, a
walk of length $T = d^2 E / K$, starting from a stationary point,
covers the tree with probability at most 1/2.

To start, set $k = 10 \ln\ln d$ and define a sequence $b_i$ as follows



b k = d 2 /^KM = k-i(l-\J -/fei-i), 

for all $i > k$. Let l be the first index such that $b_l < 1/2$.
Arithmetic shows that if K is sufficiently large then $l < d - 1$. Set
$a_i = 2^i b_i$ and $m_i = 2^{2^i}$; a simple calculation shows



cii = 2ai-i - y -ctj-ilnmj-i. 

Let $X_i$ denote the minimum number of times a multi-edge from level i
to level $i+1$ is crossed in a finite walk. We say that a walk is a
$T_i$-walk if it stops when $X_i = a_i$, and denote by $A_i$ the event
that a $T_i$-walk covers the tree. Furthermore, let B be the event
that a walk of length T satisfies $X_k \ge a_k$. Notice that

$$P(\text{a walk of length } T \text{ covers the tree}) \le P(B) + P(A_k).$$

The expectation of the number of crosses of any multi-edge between the
k-th level and the (k+1)-th level is $2^k T / E = 2^k d^2 / K$, where
$2^k$ is the multiplicity of the edge. On the other hand,
$a_k = 2^k d^2 / \sqrt{K}$ by definition. Therefore, by Markov's
inequality $P(B)$ is at most $1/\sqrt{K} < 1/3$. To finish the proof,
we show that $P(A_k) = o(1)$. Observe that for any $i \ge k$, $P(A_i)$
is upper bounded by

$$P(\text{a } T_i\text{-walk satisfies } X_{i+1} \ge a_{i+1}) + P(A_{i+1}).$$

It follows that

$$P(A_k) \le \sum_{i=k}^{l-1} P(\text{a } T_i\text{-walk satisfies } X_{i+1} \ge a_{i+1}) + P(\text{a } T_l\text{-walk covers the tree}).$$

To show that $P(A_k) = o(1)$, it now suffices to prove that

$$\sum_{i=k}^{l-1} P(\text{a } T_i\text{-walk satisfies } X_{i+1} \ge a_{i+1}) = o(1) \qquad (10)$$

and

$$P(\text{a } T_l\text{-walk covers the tree}) = o(1). \qquad (11)$$

It will be useful to think about the walk using a "balls and urns"
model. Consider a vertex u on level i. Attach an urn to each neighbor
of u. Any time we exit node u, drop a ball into the corresponding urn.
Then balls will be dropped into the urns independently, so that the
urns corresponding to the children of u have the same probability, and
the urn corresponding to the parent of u has half this probability.
Conversely, if for each node we decide how to drop balls into the
urns, then we determine a unique walk. It is important to notice that
the number of times an edge is crossed depends only on the ball
distributions corresponding to nodes above the edge.

Assume that the multi-edge between u and its parent v is crossed x
times; then the numbers of crossings of the multi-edges going down
from u are the same as the numbers of balls in the big urns at the
moment the small urn has x balls.

Using the balls and urns terminology, (10) follows from the following
lemma.

Lemma 6.2 Assume that a and m are large num- 
bers, and a' = 2a — \ \ahim > 0. Drop balls 



into one small urn and m big urns until the small urn has a balls,
then with probability at least
$1 - \big(\exp(-\ln^{2/3} m) + \exp(-m^{1/2})\big)$, one of the big
urns has at most a' balls.

Proof. We use the following fact which is easy to prove. If X is a sum
of i.i.d. binary random variables and X has large expectation $\mu$,
then for any $\sqrt{\mu} \le L \le \mu$

$$\exp(-2L^2/\mu) \le P(X \le \mu - L) \le 2\exp(-L^2/(2\mu)). \qquad (12)$$

To prove the lemma, we first show that with probability at least
$1 - \exp(-\ln^{2/3} m)$, at the first moment when the small urn has a
balls, the number of balls dropped is at most
$A = a(2m+1) + 4m\sqrt{a \ln^{2/3} m}$. To show this, it is enough to
prove that if one drops A balls randomly into one small urn and m big
urns, then with probability at least $1 - \exp(-\ln^{2/3} m)$ the
small urn has at least a balls. The number of balls in the small urn
can be expressed as a sum of A i.i.d. binary random variables and has
expectation $\mu = A/(2m+1) = a + L$, where
$L = 2\sqrt{a \ln^{2/3} m} + o(1)$. The claim follows directly from
the upper bound in (12), with room to spare.

To finish the proof of the lemma, we show that if we drop A balls into
one small urn and m big urns, then there is a big urn with at most a'
balls with probability at least $1 - \exp(-m^{1/2 + o(1)})$. The
number of balls in a fixed big urn is a sum of A i.i.d. binary random
variables and has expectation $A/(m + 1/2)$. Set
$L' = A/(m + 1/2) - a'$; it is clear that
$L' = (1 + o(1))(2a - a')$. We say an urn is "good" if it has at most
a' balls and "bad" otherwise. By the lower bound in (12), the
probability that a fixed urn is "good" is at least

$$p = \exp\big(-2L'^2 / (A/(m + 1/2))\big) \ge m^{-1/2}.$$

So the probability that an urn is "bad" is at most $1 - p$. Observe
that the events "urn $U_1$ is bad" and "urn $U_2$ is bad" are
negatively correlated, for any two fixed urns $U_1$ and $U_2$. Using
the FKG inequality and induction, we can show

$$P(\text{all } m \text{ urns are "bad"}) \le (1-p)^m \le (1 - m^{-1/2})^m \le \exp(-m^{1/2}),$$

concluding the proof. □



Now (10) follows from the previous lemma and the fact that

$$\sum_{i=k}^{l-1} \Big(\exp(-\ln^{2/3} m_i) + \exp(-m_i^{1/2 + o(1)})\Big) = o(1).$$

Here we need to use the condition $k = 10 \ln\ln d$.

To prove (11), it suffices to show that if
one drops balls into one small urn and mi = 2 2 
big urns until the small urn has a; < 2 l /2 balls, 
then with probability at least 1 — o(l), there is an 
empty big urn. Similar to the proof of Lemma 
6.2 , one can show that at the time when the small 
urn has a; balls, with probability 1 — o(l), at most 
Saimi balls have been dropped (the constant 3 is 
generous). To conclude, we show that if we drop 
Ai = 3aimi balls into mi identical urns, then with 
probability 1 — o(l), there is an empty urn. Since 
ai < 2 l /2 and mi = 2 2 ' +1 , A\ < |m;lnm^, the 
claim follows by a standard coupon collector ar- 
gument. □ 